AITopics | loss discrepancy

loss discrepancy

Beyond Efficiency: Molecular Data Pruning for Enhanced Generalization

Neural Information Processing Systems | Oct-9-2025, 20:39:10 GMT

data pruning, dataset, pruning, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Liaoning Province > Shenyang (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.49)
Health & Medicine > Therapeutic Area > Immunology (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Beyond Efficiency: Molecular Data Pruning for Enhanced Generalization

Chen, Dingshuo, Li, Zhixun, Ni, Yuyan, Zhang, Guibin, Wang, Ding, Liu, Qiang, Wu, Shu, Yu, Jeffrey Xu, Wang, Liang

arXiv.org Artificial Intelligence | Sep-2-2024

With the emergence of various molecular tasks and massive datasets, efficient training has become an urgent yet under-explored issue in the area. Data pruning (DP), a commonly cited approach to reducing training cost, filters out less influential samples to form a coreset for training. However, the increasing reliance on pretrained models for molecular tasks renders traditional in-domain DP methods incompatible. Therefore, we propose a Molecular data Pruning framework for enhanced Generalization (MolPeg), which focuses on the source-free data pruning scenario, where data pruning is applied with pretrained models. By maintaining two models with different updating paces during training, we introduce a novel scoring function that measures the informativeness of samples based on the loss discrepancy. As a plug-and-play framework, MolPeg perceives both the source and target domains and consistently outperforms existing DP methods across four downstream tasks. Remarkably, it can surpass the performance obtained from full-dataset training, even when pruning up to 60-70% of the data on the HIV and PCBA datasets. Our work suggests that the discovery of effective data-pruning metrics could provide a viable path to both enhanced efficiency and superior generalization in transfer learning.
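The abstract describes scoring samples by the loss discrepancy between two models that update at different paces. The following is a minimal sketch of that idea, not the authors' implementation. Several details are assumptions made for illustration: the slow model is updated by an exponential moving average (`ema_update`), the score is the absolute loss gap, the samples with the largest gaps are kept, and a toy linear model with squared error stands in for the molecular model and task loss. All function names are hypothetical.

```python
# Hypothetical sketch: score samples by the loss gap between a fast "online"
# model and a slowly updated "reference" model, then keep the top fraction.
# The EMA update, absolute gap, and top-k rule are assumptions, not MolPeg's
# published procedure.

def ema_update(reference, online, momentum=0.9):
    """Move the slow reference parameters a small step toward the online ones."""
    return [momentum * r + (1.0 - momentum) * o for r, o in zip(reference, online)]

def squared_loss(params, x, y):
    """Per-sample squared error of a linear model (stand-in for the task loss)."""
    pred = sum(p * xi for p, xi in zip(params, x))
    return (pred - y) ** 2

def discrepancy_scores(online, reference, samples):
    """Absolute loss gap between the two models for every (x, y) sample."""
    return [
        abs(squared_loss(online, x, y) - squared_loss(reference, x, y))
        for x, y in samples
    ]

def select_coreset(scores, keep_ratio):
    """Indices of the highest-scoring samples, returned in ascending order."""
    k = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

# Toy data: four (features, target) pairs.
samples = [
    ([1.0, 0.0], 1.0),
    ([0.0, 1.0], 2.0),
    ([1.0, 1.0], 0.0),
    ([2.0, 1.0], 3.0),
]
online = [1.0, 1.0]                      # fast model after a training step
reference = ema_update([0.0, 0.0], online, momentum=0.5)  # slow model

scores = discrepancy_scores(online, reference, samples)
kept = select_coreset(scores, keep_ratio=0.5)
```

In a real training loop the scores would be recomputed as both models change, so the retained subset could differ from epoch to epoch; how often to rescore is a design choice this sketch leaves open.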

data pruning, dataset, pruning, (14 more...)

arXiv.org Artificial Intelligence

2409.01081

Country:

Asia > China > Liaoning Province > Shenyang (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.89)
Health & Medicine > Therapeutic Area > Immunology (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Noise Induces Loss Discrepancy Across Groups for Linear Regression

Khani, Fereshte, Liang, Percy

arXiv.org Machine Learning | Nov-22-2019

This loss discrepancy across groups is especially problematic in critical applications that impact people's lives (Berk, 2012; Chouldechova, 2017). Despite the vast literature on removing loss discrepancy (Hardt et al., 2016; Khani et al., 2019; Agarwal et al., 2018; Zafar et al., 2017), the direct removal of loss discrepancy might introduce other problems such as intragroup loss discrepancy (Lipton et al., 2018) and adverse long-term impacts (Liu et al., 2018). Therefore, it is important to understand the source of loss discrepancy. Why do such loss discrepancies exist? The literature generally studies sources of loss discrepancy due to an "information deficiency" of one group--that is, one group has, for example, more noise (Corbett-Davies et al., 2017), less ... [Preliminary work, under review.]

discrepancy, loss discrepancy, sld, (13 more...)

arXiv.org Machine Learning

1911.09876

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.50)

Industry:

Law (0.93)
Education > Educational Setting (0.68)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.41)

Maximum Weighted Loss Discrepancy

Khani, Fereshte, Raghunathan, Aditi, Liang, Percy

arXiv.org Machine Learning | Jun-8-2019

Though machine learning algorithms excel at minimizing the average loss over a population, this might lead to large discrepancies between the losses across groups within the population. To capture this inequality, we introduce and study a notion we call maximum weighted loss discrepancy (MWLD), the maximum (weighted) difference between the loss of a group and the loss of the population. We relate MWLD to group fairness notions and robustness to demographic shifts. We then show MWLD satisfies the following three properties: 1) It is statistically impossible to estimate MWLD when all groups have equal weights. 2) For a particular family of weighting functions, we can estimate MWLD efficiently. 3) MWLD is related to loss variance, a quantity that arises in generalization bounds. We estimate MWLD with different weighting functions on four common datasets from the fairness literature. We finally show that loss variance regularization can halve the loss variance of a classifier and hence reduce MWLD without suffering a significant drop in accuracy.
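The definition in the abstract (the maximum weighted difference between a group's loss and the population's loss) can be illustrated on a small, finite collection of groups. This is a sketch under stated assumptions rather than the paper's estimator: the weighting function is taken to be the group's population fraction raised to a power `k`, which is one plausible family and is assumed here, and only explicitly listed groups are considered. The function names are hypothetical. Setting `k=0` gives every group equal weight, the case the abstract says cannot be estimated statistically in general.

```python
# Hypothetical sketch of a maximum weighted loss discrepancy over listed groups.
# The weight w(g) = (group fraction) ** k is an assumed weighting family.

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def max_weighted_loss_discrepancy(losses, groups, k=1.0):
    """Largest w(g) * |mean loss in g - mean loss overall| across groups.

    losses: per-example losses for the whole population.
    groups: mapping from group name to the list of member indices.
    """
    population_loss = mean(losses)
    n = len(losses)
    worst = 0.0
    for members in groups.values():
        group_loss = mean([losses[i] for i in members])
        weight = (len(members) / n) ** k
        worst = max(worst, weight * abs(group_loss - population_loss))
    return worst

def loss_variance(losses):
    """Population variance of the per-example losses."""
    mu = mean(losses)
    return mean([(loss - mu) ** 2 for loss in losses])

# Toy population: group "a" is fit well, group "b" is fit poorly.
losses = [0.0, 0.0, 1.0, 3.0]
groups = {"a": [0, 1], "b": [2, 3]}

weighted = max_weighted_loss_discrepancy(losses, groups, k=1.0)
unweighted = max_weighted_loss_discrepancy(losses, groups, k=0.0)
variance = loss_variance(losses)
```

Here both groups deviate from the population loss of 1.0 by the same amount, so the weighting only rescales the result. With groups of unequal size, a larger `k` discounts small groups more heavily.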

artificial intelligence, machine learning, mwld, (16 more...)

arXiv.org Machine Learning

1906.03518

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Florida > Broward County (0.04)

Genre: Research Report (0.50)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
